ISD October 2000


The Nuts and Bolts of Logic BIST

Logic BIST makes good sense when it isn't practical to apply test vectors at the tester interface.

By Cliff Warren

Opinions differ on the role of logic BIST in system-on-a-chip (SOC) ASIC design. Logic BIST can test a circuit to a fault coverage of 90 percent to 95 percent or more, using only a clock and a "test-mode" signal as inputs. Outputs are simplified as well, since the pass/fail result can be reduced to a single output signal. This becomes especially interesting given today's emphasis on intellectual property (IP) and SOC design. An IP block can have test circuitry built in and ready for a user who may not be familiar with the design. Because of the simplified test interface, logic BIST holds promise to enable testing in an SOC setting, where not all inputs and outputs of a particular logic block are pinned out.

However, it's perhaps what basic logic BIST can't do that concerns many would-be users. It doesn't test parametric faults, path delay faults, IDDQ, or other AC faults. These types of tests consume the majority of the expensive automatic test equipment (ATE) time anyway. Therefore, the logic BIST'ed chip must still be tested on an expensive tester capable of testing such faults.

The type of logic BIST discussed here will only work on a design that has had scan inserted. Once scan has been inserted, however, automatic test pattern generation (ATPG) can create a vector set that is driven directly into the scan-inserted design, with no additional circuitry needed. ATPG test patterns typically give high fault coverage, usually slightly higher than logic BIST alone can achieve. But with the addition of test points, and no limitations on the number of scan chains, test time with logic BIST can be quite reasonable. It stands to reason that logic BIST makes sense in situations where it isn't practical to apply test vectors at the tester interface - as in SOC designs where not all pins into a particular test block are available at the interface, or when the ATPG vector set is so large that it overburdens the ATE.

Logic BIST building blocks

Any successful logic BIST starts with a suitable design: the design must have scan inserted. Without scan, the application of pseudo-random patterns won't work. I have run experiments using pseudo-random vectors and a non-scanned core without success. The coverage typically tops out at less than 5 percent. Without scan, deterministic (not pseudo-random) vectors would be required. It's not easy, and in fact not practical, to create deterministic vectors in hardware for most designs. For example, in one case it was determined that the logic required to create a small deterministic pattern was much larger than the device to be tested. The exception is a design that is sufficiently repeatable and predictable in its function, such as a multiplier or a simple microprocessor.

Figure 1 - Inputs and outputs

With a scanned design, there are several types of input and output signals to be concerned with (see Figure 1). There are regular inputs, clocks, and resets to be dealt with. Also, the addition of scan will result in scan inputs, and a scanmode ("test_se") input. On the output side, there are regular outputs as well as scan outputs.

A well-designed BIST takes over all inputs, and evaluates all of the outputs, to the device under test (DUT). One wouldn't want to leave out signals such as a clock or a reset, and then depend on a user to handle them during the BIST test.

Linear feedback shift register

The scan chain inputs are connected to a linear feedback shift register (LFSR) during the BIST test. The BIST includes mux circuitry to do this. This should normally be a "maximal length" LFSR, which is an LFSR that doesn't repeat its sequence until it has visited each and every state (except the all-zero vector). The output of the LFSR becomes the pseudo-random pattern, which replaces the vector set normally given by ATPG.

Regular inputs must also be driven by the BIST during the test. An LFSR could be used to do this; however, the inputs must be sequenced from a different phase clock than the scan chain, or violations will likely occur. A better option is to drive the regular inputs from the outputs of the various scan chains.
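As a rough illustration of what "maximal length" means, the following Python sketch models a 16-bit LFSR in software. The tap positions (16, 14, 13, 11) are one commonly published maximal-length choice and are an assumption here, not necessarily the taps used in the designs described in this article. The function names and the seed value are hypothetical as well.

```python
def lfsr16_step(state):
    """Advance a 16-bit Fibonacci LFSR by one clock (assumed taps 16, 14, 13, 11)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def lfsr16_period(seed=0xACE1):
    """Count the clocks needed for the register to return to its seed state."""
    state = lfsr16_step(seed)
    count = 1
    while state != seed:
        state = lfsr16_step(state)
        count += 1
    return count


def scan_bits(seed, n_chains, n_cycles):
    """Yield one tuple of scan-in bits per shift clock, one bit per scan chain."""
    state = seed
    for _ in range(n_cycles):
        yield tuple((state >> i) & 1 for i in range(n_chains))
        state = lfsr16_step(state)


if __name__ == "__main__":
    print("period:", lfsr16_period())
    for bits in scan_bits(0xACE1, n_chains=9, n_cycles=4):
        print(bits)
```

The all-zero state maps to itself in this model, which is why a maximal-length LFSR must never be seeded with zero.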

Reset signals are important; if you connect a reset to the pseudo-random signal you will be constantly resetting the scan chains, and the potential fault coverage will most likely go down. Instead, resets should be handled directly by the BIST controller. In fact, a reset test is advisable; hold the reset active, and run a few test patterns. Then keep reset inactive for the remainder of the test.

The clock inputs become the scan chain clocks during the test. The test clock or "BIST" clock drives these clock inputs.

Evaluating outputs

The outputs must be evaluated. This is most easily done with a multiple-input signature register (MISR). This register takes in a vector and mixes it with the current state of the register to produce a new result, which it leaves in the register. If any single output from the design differs from what was expected at the input to the MISR, the state of the MISR becomes different, and stays different, so that the final result - the "signature" - will be different. With a sufficiently large number of bits in the MISR, the chance that a failure or group of failures could still produce the correct signature is remote. Then, after a predetermined time, the BIST must compare the value in the MISR with a known value. If the value is correct, one can be reasonably assured that the fault coverage tested for has been achieved and the design is correct to that fault coverage level. Optionally, the value in the MISR could be shifted out instead of being compared by the controller.
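The signature mechanism can be sketched in a few lines of Python. This is a minimal software model under stated assumptions: a 16-bit register, a Galois-style feedback mask of 0xB400 (an assumed polynomial, not taken from the article), and made-up response words.

```python
MISR_WIDTH = 16
MISR_TAPS = 0xB400  # assumed feedback mask (x^16 + x^14 + x^13 + x^11 + 1)


def misr_step(state, outputs):
    """Shift the register one clock and fold in one vector of DUT outputs."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= MISR_TAPS
    return (state ^ outputs) & ((1 << MISR_WIDTH) - 1)


def signature(responses, seed=0):
    """Compact a sequence of output vectors into a single signature."""
    state = seed
    for word in responses:
        state = misr_step(state, word)
    return state


if __name__ == "__main__":
    good = [0x1234, 0xBEEF, 0x0000, 0xFFFF, 0x8001]  # hypothetical responses
    bad = list(good)
    bad[1] ^= 0x0010  # flip a single output bit in one cycle
    print(hex(signature(good)), hex(signature(bad)))
```

Because each step in this model is linear and reversible, a single wrong bit always changes the final signature. Only combinations of several errors can cancel each other out and alias to the good value.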

Using a MISR with 16 bits in the register, the chance that you can miss a failure is 1/2^16, or 1/65,536. In most cases, however, the number of outputs that must be evaluated dictates the size of the MISR. If you have a very high number of outputs, you can combine some of the signals being watched to keep the size of the register down and save gates. For example, what the author terms a "two-to-one reduction" MISR uses an XOR cell to reduce two signals into one. In this case, one would miss an error only if both signals going into the same two-input XOR were wrong at the same time. It's a tradeoff between gate count and having reasonable assurance that errors aren't being masked.
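The masking behavior of a two-to-one reduction can be seen in a small Python sketch. The function name and the example bit patterns are hypothetical and chosen only for illustration.

```python
def reduce_two_to_one(bits):
    """XOR adjacent pairs of signals so 2n observed bits become n MISR inputs."""
    if len(bits) % 2:
        raise ValueError("expected an even number of signals")
    return [bits[i] ^ bits[i + 1] for i in range(0, len(bits), 2)]


expected = [1, 0, 1, 1]      # fault-free outputs
single_error = [0, 0, 1, 1]  # first signal wrong
paired_error = [0, 1, 1, 1]  # both signals of the first pair wrong

if __name__ == "__main__":
    print(reduce_two_to_one(expected))
    print(reduce_two_to_one(single_error))  # differs, so the error is seen
    print(reduce_two_to_one(paired_error))  # same as expected, error masked
```

A single wrong signal still reaches the MISR, while two simultaneous errors in the same pair cancel before the register ever sees them.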

Don't clock X-states

It's absolutely essential that X-states not be clocked into the MISR. If they are, the MISR will become corrupted and signature analysis will be impossible (see Figure 2).

Figure 2 - Logic BIST architecture

The architecture includes the LFSR, MISR, controller, and DUT interconnected as described.

The final piece of logic BIST is a controller that ties the LFSR, the MISR, and the DUT together, and runs the test, determining when the test is done and checking the signature in the MISR. The controller also handles the muxing of the DUT inputs, switching them to the BIST sources during test mode and back to their normal sources otherwise. The scan enable (test_se) signal needs to be handled by the BIST. The BIST needs to put the device in scan mode for as many cycles as are needed to fill the longest scan chain, then go out of scan mode for one cycle, then repeat. As well as directing the other hardware pieces, the controller actually applies the test. This consists of loading the scan chains with data, handling the scan enable pin for data capture, and then unloading the scan chains. This operation follows basic scan methodology, implemented in hardware. In addition to the regular scan chain test, the controller may also direct some specific tests, such as the reset active test.
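The shift-then-capture rhythm of the scan enable signal can be sketched as a simple Python generator. This models only the test_se waveform. The function names are hypothetical, and the sketch ignores the final unload pass that shifts the last captured response into the MISR.

```python
def scan_enable_sequence(longest_chain, n_patterns):
    """Yield test_se per clock: 1 while shifting, 0 for the single capture cycle."""
    for _ in range(n_patterns):
        for _ in range(longest_chain):
            yield 1  # scan mode: load the next pattern, unload the last response
        yield 0      # one functional clock captures the response


def total_clocks(longest_chain, n_patterns):
    """Clock count for the shift/capture loop (final unload not included)."""
    return n_patterns * (longest_chain + 1)


if __name__ == "__main__":
    print(list(scan_enable_sequence(longest_chain=3, n_patterns=2)))
    print(total_clocks(longest_chain=3, n_patterns=2))
```

Since the clock count scales with the longest chain, splitting the flops across more, shorter chains shortens the test for a given number of patterns.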

Each of these pieces, the LFSR, the MISR, and the controller can be built with a hardware description language such as Verilog or VHDL in RTL form. These blocks can be easily edited and synthesized to create a specific implementation of logic BIST. In the case of the LFSR and MISR, the author has created a library of RTL code that stores common sizes of these functions.

Table 1 - Tracking outputs

Logic Block    Gate Count
DUT                1412.0
LFSR                 51.4
MISR                156.2
Controller          187.1
Total BIST          394.7

Case study in logic BIST

In working through these examples, I used the following: a synthesis tool to convert RTL-level HDL code to gates, a fault-grading tool to determine the fault coverage under various configurations, a simulation tool to determine good signatures and to make sure the overall circuit works, and a good text editor. Each of the following examples was placed on a test chip along with two RAM blocks that were also BIST'ed. The entire chip was evaluated on a tester for proper BIST operation. The results were very encouraging, as I was able to test the entire chip to an estimated 95 percent fault coverage with essentially a clock and testmode signal as input (the figure is an estimate because the fault coverage of the RAM blocks was itself estimated).

Circuit 1 - MAC
The first design to be BIST'ed was a multiplier-accumulator (MAC), which was a combinational circuit (no memory elements). Because of this, scan insertion was not necessary. Because the MAC is combinational, it was possible to achieve 98.9 percent fault coverage with only 500 pseudo-random vectors. While the MAC had 32 inputs, it was decided that an 8-bit LFSR was sufficient to get a suitable vector set. This meant that I used each LFSR output about 4 times. There were 16 outputs from the MAC, so a 16-bit MISR was used (see Table 1). Thus the inclusion of BIST on this simple combinational circuit resulted in a 28 percent gate count increase.

Table 2 - Covering the outputs

Logic Block      Gate Count
UART w/o scan        2126.2
UART w/scan          2415.7
LFSR                  101.6
MISR                  239.6
Controller            379.3
Total BIST            720.5

Circuit 2 - UART
Circuit 2 is a UART that takes 2,126 gates. This is a common IP block that many customers of AMI have used over the years. The circuit had five clock domains, each requiring a separate scan chain. Also, the largest scan chain was broken up into smaller chains to improve testability. There were a total of nine scan chains. After scan insertion, the DUT became 2,407 gates, an increase of 281, or 13.2 percent.

A 16-bit LFSR was chosen. For a maximal length LFSR, this gives 65,535 unique patterns before repeating. (A maximal length LFSR normally doesn't come to the all-zero state.) A 24-bit MISR was needed to cover all the outputs, both scan and regular (see Table 2).

In this case, BIST adds 29.8 percent on top of the scanned design, which is fairly high. The fault coverage achieved by the BIST on this circuit was 93 percent.

Circuit 3 - small microprocessor
The third circuit was a small microprocessor requiring 9,053 gates. When scan was added, the gate count rose to 9,581. It reported one clock and a single scan chain of length 493. It has 54 regular inputs, not including the clock and reset, and 68 regular outputs.

A 23-bit LFSR was chosen. This would give many more unique vectors than needed. My goal was to have enough LFSR signals in order to eliminate the need to reuse each output too many times.

Phase shifting of the LFSR output was used in this logic BIST design in an attempt to get to the desired coverage. To illustrate the concept of phase shifting, consider the following example output of an LFSR:


001001101001110100100001
010011010011101001000010
100110100111010010000100
001101001110100100001001
011010011101001000010011

Notice how the ones (and zeros) shift diagonally through the register. When each of these is treated as an input to the DUT, that suggests a relationship between the input channels. To break this inter-channel dependence, a phase shifter can be used. After phase shifting, the output of the LFSR no longer exhibits the diagonal relationship.

To create a phase shift, one can use XOR gates between various LFSR outputs to break up the dependence. The selection of optimal phase shifting is the topic of another paper [1]. In comparing the pre-phase-shifted BIST with the phase-shifted version, fault coverage was increased by 3 percent.
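The diagonal relationship, and how XOR gates break it, can be demonstrated with a small Python model. This sketch uses an 8-bit LFSR with assumed taps (8, 6, 5, 4) and an arbitrary pairing of stages for the XOR network. It is not the optimized selection discussed in [1], and all names are hypothetical.

```python
WIDTH = 8


def lfsr8_step(state):
    """One clock of an 8-bit Fibonacci LFSR (assumed taps 8, 6, 5, 4)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
    return (state >> 1) | (bit << 7)


def raw_channels(state):
    """Drive each channel straight from one LFSR stage."""
    return [(state >> k) & 1 for k in range(WIDTH)]


def phase_shifted_channels(state):
    """Drive each channel from the XOR of two stages (arbitrary pairing)."""
    bits = raw_channels(state)
    return [bits[k] ^ bits[(3 * k + 1) % WIDTH] for k in range(WIDTH)]


def trace(channel_fn, seed=0x01, cycles=32):
    """Record the channel values for a number of clocks."""
    rows, state = [], seed
    for _ in range(cycles):
        rows.append(channel_fn(state))
        state = lfsr8_step(state)
    return rows


def is_diagonal(rows):
    """True if channel k at clock t+1 always repeats channel k+1 at clock t."""
    return all(
        nxt[k] == cur[k + 1]
        for cur, nxt in zip(rows, rows[1:])
        for k in range(WIDTH - 1)
    )


if __name__ == "__main__":
    print("raw diagonal:", is_diagonal(trace(raw_channels)))
    print("phase-shifted diagonal:", is_diagonal(trace(phase_shifted_channels)))
```

In the raw trace every channel is simply its neighbor delayed by one clock. After the XOR network, adjacent channels no longer track each other one clock apart.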

Flexible circuit

This circuit proved to be quite flexible with regard to the number of scan chains. It was fully synchronous, with one common clock throughout. It was found that, in general, the insertion of extra scan chains resulted in higher fault coverage. It was eventually decided that 15 scan chains, with 30 flops per chain, would give the desired coverage while not impacting the size of the BIST to a great degree. The extra scan chains don't impact the gate count of the scan circuitry, but in the use of BIST, more scan chains mean more inputs to drive and more outputs to evaluate.

Table 3 - Breakdown of gate counts

Logic Block               Gate Count
UP w/o scan                   9052.6
UP w/scan                     9580.5
LFSR and phase shifter         193.5
MISR                           442.9
Controller                     413.7
Total BIST                    1050.0

This circuit required about 8,000 patterns (scan cycles) to get to approximately 91 percent fault coverage. More coverage could probably be achieved with a proper selection of test-point insertion. Test-point insertion is a strategy to increase the controllability and observability of a particular block of logic. This is normally used to detect pseudo-random pattern resistant faults, or faults that aren't found without a very specific vector. For example, consider an "AND-tree" with 32 inputs. Each of these inputs must be at a logic value one to trigger a one on the output and detect a stuck-at zero on the output. This vector is one of 2^32 (or 4 Gig) pseudo-random vectors. It's not likely that one will find this vector in one's pseudo-random set. Thus we would need to modify the DUT during the BIST test to increase the controllability and observability at this node. This isn't easy to do without specialized software that can identify these nodes and suggest test point circuitry (see Table 3).
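To put a number on "not likely," the following Python sketch estimates the chance of hitting the all-ones vector. It assumes each pattern is an independent, uniformly random vector, which is a simplification of real LFSR behavior. The function name is hypothetical.

```python
import math


def hit_probability(n_inputs, n_patterns):
    """Chance that at least one of n_patterns random vectors is all ones."""
    p_single = 2.0 ** -n_inputs
    # 1 - (1 - p)^N, written with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(n_patterns * math.log1p(-p_single))


if __name__ == "__main__":
    p = hit_probability(n_inputs=32, n_patterns=8000)
    print(f"chance of exercising the 32-input AND-tree: {p:.2e}")
```

Under that assumption, 8,000 patterns give roughly a two-in-a-million chance of detecting the fault, which is why a test point at such a node pays off.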

In this case logic BIST adds 10.9 percent to the scanned design.

More observations

It should be noted that the size of BIST circuitry isn't typically dependent on the size of the DUT. What does tend to affect the size of the BIST is the number of inputs to be driven and the number of outputs to be evaluated. In fact, the greatest factor in BIST size is the number of outputs.

Note that while the logic BIST in the small microprocessor example is bigger than the one in the UART, the percentage of circuitry added is lower. This is because the microprocessor DUT is larger, and it shows that the size of the BIST, as a percentage of the DUT, typically goes down as the DUT grows.
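The overhead percentages quoted for the three circuits can be recomputed from the gate counts in Tables 1 through 3. The short Python sketch below does this. The dictionary layout and function name are illustrative only.

```python
def bist_overhead_percent(dut_gates, bist_gates):
    """BIST gate count as a percentage of the (scanned) DUT gate count."""
    return 100.0 * bist_gates / dut_gates


# (DUT gates including scan where applicable, total BIST gates) from Tables 1-3
designs = {
    "MAC": (1412.0, 394.7),
    "UART": (2415.7, 720.5),
    "microprocessor": (9580.5, 1050.0),
}

if __name__ == "__main__":
    for name, (dut, bist) in designs.items():
        print(f"{name}: {bist} BIST gates, "
              f"{bist_overhead_percent(dut, bist):.2f} percent overhead")
```

The MAC and UART ratios reproduce the 28 percent and 29.8 percent figures. The microprocessor ratio comes out just under 11 percent, in line with the 10.9 percent quoted earlier.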

Design-for-test becomes a more important issue when dealing with pseudo-random vectors. Unlike ATPG, the test vectors are random. One can't specify in advance how structures will be tested. The key to testability is in creating testable logic in the first place. The same point could be made about test-point insertion. Test-point insertion becomes that much more important when faced with pseudo-random vectors.

More area considerations

It's useful to try and determine a range of expected sizes for logic BIST. This can be accomplished by examining the components.

For example, one probably wouldn't consider using any less than 8 bits in the LFSR. On the other hand, in almost every case, a 23-bit LFSR would be as high as one would need to go. With 23 bits, there are more than 8 million patterns available, but the real reason for this many bits in the LFSR is to be able to cover all the inputs without needing to reuse too many bits. Using a typical gate-level library, an 8-bit LFSR is 51 gates, and a 23-bit LFSR is 141.

For the MISR, one could set the range at 16 bits on the low side to 80 bits on the high side. (These are admittedly general limits, and serve for the purposes of illustration. Of course, you may adjust for your application.) A 16-bit MISR at the gate level gives 156 gates, and for an 80-bit, 2-to-1 reduction MISR, expect 468 gates.

As for the controller, for designs that require scan (as opposed to combinational only designs), most will require between 250 and 400 gates.

Logic BIST adds area that isn't in proportion to the size of the logic being tested. In most cases Logic BIST would add between 500 and 1,000 gates. Logic BIST can lead to coverage in the range of 75 percent to 99 percent, but isn't as efficient as deterministic (ATPG) vectors. Logic BIST depends on scan to work properly, and the testability of the overall design is a factor.
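The 500-to-1,000-gate estimate follows from adding up the component ranges given above. A minimal Python sketch of that arithmetic, using the gate counts quoted in this section (the names are hypothetical):

```python
# (low, high) gate counts quoted above for each BIST component
COMPONENT_GATES = {
    "LFSR": (51, 141),         # 8-bit to 23-bit
    "MISR": (156, 468),        # 16-bit to 80-bit with 2-to-1 reduction
    "controller": (250, 400),  # scan-based designs
}


def bist_area_range(components=COMPONENT_GATES):
    """Sum the low ends and the high ends of each component's range."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return low, high


if __name__ == "__main__":
    low, high = bist_area_range()
    print(f"expected logic BIST size: {low} to {high} gates")
```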

Logical conclusion

It might make sense to use logic BIST in one or a combination of the following scenarios:

1) When the design is part of a system on a chip, or a similar situation where it isn't easy or isn't possible to pin-out the I/O to the block being tested. Particularly when the chip pin count is low, this situation could occur. It stands to reason that this scenario is on the increase, with the focus on SOC design. In fact, logic BIST may enable testing on SOC designs where no other method is feasible.

2) As a deliverable with intellectual property, which may allow a user to receive a more complete solution, in that a design is completely testable "right out of the box."

3) To enable the testing of a particular chip on a particular tester. Some test programs are so large that they overflow the memory in a tester. In this case, logic BIST can be used to test for the majority of faults. Then the tester can apply a much smaller set of deterministic vectors to test for the remaining faults.

4) When a design is "pad limited." This is when the number of pads needed on the chip is large enough that it determines the minimum size of the chip. In this situation, the addition of logic BIST circuitry wouldn't affect die size, although the additional circuitry would increase the number of transistors that have the potential of going bad.

5) To fit into an overall test scheme, such as JTAG, or in conjunction with other logic BIST or memory BIST strategies.

Cliff Warren has been with American Microsystems, Inc. for 10 years. He currently works in the memories and megacells group.

References

[1] Janusz Rajski and Jerzy Tyszer, "Automated Synthesis of Large Phase Shifters for Built-In Self-Test," ITC 1999.

Copyright © 2000 Integrated System Design Magazine
